Anomaly Detection for Symbolic Sequences and Time Series Data A THESIS SUBMITTED TO THE FACULTY OF THE GRADUATE SCHOOL OF THE UNIVERSITY OF MINNESOTA BY

نویسنده

  • Varun Chandola
چکیده

This thesis deals with the problem of anomaly detection for sequence data. Anomaly detection has been a widely researched problem in several application domains such as system health management, intrusion detection, healthcare, bioinformatics, fraud detection, and mechanical fault detection. Traditional anomaly detection techniques analyze each data instance (as a univariate or multivariate record) independently, and ignore the sequential aspect of the data. Often, anomalies in sequences can be detected only by analyzing data instances together as a sequence, and hence cannot detected by traditional anomaly detection techniques. The problem of anomaly detection for sequence data is a rich area of research because of two main reasons. First, sequences can be of different types, e.g., symbolic sequences, time series data, etc., and each type of sequence poses unique set of problems. Second, anomalies in sequences can be defined in multiple ways and hence there are different problem formulations. In this thesis we focus on solving one particular problem formulation called semi-supervised anomaly detection. We study the problem separately for symbolic sequences, univariate time series data, and multivariate time series data. The state of art on anomaly detection for sequences is limited and fragmented across application domains. For symbolic sequences, several techniques have been proposed within specific domains, but it is not well-understood as to how a technique developed for one domain would perform in a completely different domain. For univariate time series data, limited techniques exist, and are only evaluated for specific domains, while for multivariate time series data, anomaly detection research is relatively untouched. This thesis has two key goals. First goal is to develop novel anomaly detection techniques for different types of sequences which perform better than existing techniques across a variety of application domains. The second goal is to identify the best anomaly detection technique for a given application domain. By realizing the first goal we develop

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

The detection of 11th of March 2011 Tohoku's TEC seismo-ionospheric anomalies using the Singular Value Thresholding (SVT) method

The Total Electron Content (TEC) measured by the Global Positioning System (GPS) is useful for registering the pre-earthquake ionospheric anomalies appearing before a large earthquake. In this paper the TEC value was predicted using the singular value thresholding (SVT) method. Also, the anomaly is detected utilizing this predicted value and the definition of the threshold value, leading to the...

متن کامل

A Novel Ensemble Approach for Anomaly Detection in Wireless Sensor Networks Using Time-overlapped Sliding Windows

One of the most important issues concerning the sensor data in the Wireless Sensor Networks (WSNs) is the unexpected data which are acquired from the sensors. Today, there are numerous approaches for detecting anomalies in the WSNs, most of which are based on machine learning methods. In this research, we present a heuristic method based on the concept of “ensemble of classifiers” of data minin...

متن کامل

Fault Strike Detection Using Satellite Gravity Data Decomposition by Discrete Wavelets: A Case Study from Iran

Estimating the gravity anomaly causative bodies boundary can facilitate the gravity field interpretation. In this paper, 2D discrete wavelet transform (DWT) is employed as a method to delineate the boundary of the gravity anomaly sources. Hence, the GRACE’ satellite gravity data is decomposed using DWT. DWT decomposites a single approximation coefficients into four distinct components: the appr...

متن کامل

Moving dispersion method for statistical anomaly detection in intrusion detection systems

A unified method for statistical anomaly detection in intrusion detection systems is theoretically introduced. It is based on estimating a dispersion measure of numerical or symbolic data on successive moving windows in time and finding the times when a relative change of the dispersion measure is significant. Appropriate dispersion measures, relative differences, moving windows, as well as tec...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2008